ProPOSEL: a human-oriented prosody and PoS English lexicon for machine-learning and NLP

نویسندگان

Claire Brierley

Eric Steven Atwell

چکیده

ProPOSEL is a prosody and PoS English lexicon, purpose-built to integrate and leverage domain knowledge from several well-established lexical resources for machine learning and NLP applications. The lexicon of 104049 separate entries is in accessible text file format, is human and machine-readable, and is intended for open source distribution with the Natural Language ToolKit. It is therefore supported by Python software tools which transform ProPOSEL into a Python dictionary or associative array of linguistic concepts mapped to compound lookup keys. Users can also conduct searches on a subset of the lexicon and access entries by word class, phonetic transcription, syllable count and lexical stress pattern. ProPOSEL caters for a range of different cognitive aspects of the lexicon  .

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exploring complex vowels as phrase break correlates in a corpus of English speech with proPOSEL, a prosody and POS English lexicon

Real-world knowledge of syntax is seen as integral to the machine learning task of phrase break prediction but there is a deficiency of a priori knowledge of prosody in both rule-based and data-driven classifiers. Speech recognition has established that pauses affect vowel duration in preceding words. Based on the observation that complex vowels occur at rhythmic junctures in poetry, we run sig...

متن کامل

ProPOSEC: A Prosody and PoS Annotated Spoken English Corpus

We have previously reported on ProPOSEL, a purpose-built Prosody and PoS English Lexicon compatible with the Python Natural Language ToolKit. ProPOSEC is a new corpus research resource built using this lexicon, intended for distribution with the Aix-MARSEC dataset. ProPOSEC comprises multi-level parallel annotations, juxtaposing prosodic and syntactic information from different versions of the ...

متن کامل

ProPOSEL: A Prosody and POS English Lexicon for Language Engineering

ProPOSEL is a prototype prosody and PoS (part-of-speech) English lexicon for Language Engineering, derived from the following language resources: the computer-usable dictionary CUVPlus, the CELEX-2 database, the Carnegie-Mellon Pronouncing Dictionary, and the BNC, LOB and Penn Treebank PoS-tagged corpora. The lexicon is designed for the target application of prosodic phrase break prediction but...

متن کامل

Prosody resources and symbolic prosodic features for automated phrase break prediction

It is universally recognised that humans process speech and language in chunks, each meaningful in itself. Any two renditions or assimilations of a given sentence will exhibit similarities and discrepancies in chunking, where speakers and readers use pauses and inflections to mark phrase breaks. This thesis reviews deterministic and stochastic approaches to phrase break prediction, plus dataset...

متن کامل

Complex Vowels as Boundary Correlates in a Multi-Speaker Corpus of Spontaneous English Speech

We have found empirical evidence of a correlation in English between words containing complex vowels (diphthongs and triphthongs) and ‘gold-standard’ phrase break annotations in datasets as apparently different as seventeenth-century verse and a Reith lecture transcript on economics from the late twentieth-century. Spontaneous speech in the form of BBC radio news reportage from the 1980s again ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2008

ProPOSEL: a human-oriented prosody and PoS English lexicon for machine-learning and NLP

نویسندگان

چکیده

منابع مشابه

Exploring complex vowels as phrase break correlates in a corpus of English speech with proPOSEL, a prosody and POS English lexicon

ProPOSEC: A Prosody and PoS Annotated Spoken English Corpus

ProPOSEL: A Prosody and POS English Lexicon for Language Engineering

Prosody resources and symbolic prosodic features for automated phrase break prediction

Complex Vowels as Boundary Correlates in a Multi-Speaker Corpus of Spontaneous English Speech

عنوان ژورنال:

اشتراک گذاری